8 research outputs found

Do optimization methods in deep learning applications matter?

Author: Kiran Mariam
Ozyildirim Buse Melis
Publication venue: eScholarship, University of California
Publication date: 28/02/2020

With advances in deep learning, exponential data growth, and increasing model complexity, the development of efficient optimization methods is attracting much research attention. Several implementations favor Conjugate Gradient (CG) and Stochastic Gradient Descent (SGD) as practical and elegant ways to achieve quick convergence; however, these optimization processes also present many limitations in learning across deep learning applications. Recent research is exploring higher-order optimization functions as better approaches, but these pose very complex computational challenges for practical use. In this paper, we compare first-order and higher-order optimization functions. Our experiments reveal that Levenberg-Marquardt (LM) achieves significantly better convergence but suffers from very long processing times, which increases the training complexity of both classification and reinforcement learning problems. Our experiments compare off-the-shelf optimization functions (CG, SGD, LM, and L-BFGS) on standard CIFAR, MNIST, CartPole, and FlappyBird experiments. The paper presents arguments on which optimization functions to use and, further, which functions would benefit from parallelization efforts to improve pretraining time and learning rate convergence.
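As a rough illustration of the kind of update Levenberg-Marquardt performs, the sketch below fits a single parameter of a toy model by damped Gauss-Newton steps. The model y = exp(w * x), the function name lm_fit, the starting damping value, and the factor-of-ten damping schedule are all assumptions chosen for illustration; none of them come from the paper, and the paper's experiments use off-the-shelf implementations on far larger problems.

```python
import math

def lm_fit(xs, ys, w=0.0, damping=1e-3, iters=50):
    """Fit y = exp(w * x) with damped Gauss-Newton (LM-style) steps.

    Hypothetical toy example: one scalar parameter, squared-error loss.
    """
    def loss(value):
        return 0.5 * sum((math.exp(value * x) - y) ** 2 for x, y in zip(xs, ys))

    for _ in range(iters):
        residuals = [math.exp(w * x) - y for x, y in zip(xs, ys)]
        jac = [x * math.exp(w * x) for x in xs]             # d(residual)/dw
        grad = sum(j * r for j, r in zip(jac, residuals))   # J^T r
        curv = sum(j * j for j in jac)                      # J^T J
        step = -grad / (curv + damping)                     # damped step
        if loss(w + step) < loss(w):
            w += step            # accept the step and trust the model more
            damping /= 10.0
        else:
            damping *= 10.0      # reject the step and damp more heavily
    return w

xs = [0.0, 0.5, 1.0, 1.5, 2.0]
ys = [math.exp(0.5 * x) for x in xs]   # synthetic data generated with w = 0.5
w_hat = lm_fit(xs, ys)
```

With many parameters the scalar division becomes a linear solve against J^T J plus the damping term, which is one source of the processing cost the abstract mentions.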

arXiv.org e-Print Archive

eScholarship - University of California

Levenberg–Marquardt multi-classification using hinge loss function

Author: Ozyildirim Buse Melis
Publication date: 08/12/2022

Ezid

Levenberg-Marquardt multi-classification using hinge loss function.

Author: Kiran Mariam
Ozyildirim Buse Melis
Publication venue: eScholarship, University of California
Publication date: 01/11/2021

Incorporating higher-order optimization functions, such as Levenberg-Marquardt (LM), has yielded more generalizable solutions for deep learning problems. However, these higher-order optimization functions suffer from very long processing times and high training complexity, especially as training datasets become large, such as in multi-view classification problems, where finding global optima is very costly. To solve this issue, we develop a solution for LM-enabled classification with what is, to the best of our knowledge, the first implementation of hinge loss for multiview classification. Hinge loss allows the neural network to converge faster and perform better than other loss functions such as logistic or square loss. We validate our method by experimenting with various multiclass classification challenges of varying complexity and training data size. The empirical results show the training time and accuracy rates achieved, highlighting how our method outperforms in all cases, especially when training time is limited. Our paper presents important results on the relationship between optimization and loss functions and how these can impact deep learning problems.
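For readers unfamiliar with hinge loss in a multiclass setting, here is a minimal sketch of one common formulation (the Crammer-Singer form). The abstract does not say which variant the authors use, so the formulation, the function name multiclass_hinge, and the margin of 1.0 are assumptions.

```python
def multiclass_hinge(scores, label, margin=1.0):
    """Crammer-Singer style multiclass hinge loss for one sample.

    Assumed formulation: penalize the sample unless the true-class score
    beats the best competing class score by at least `margin`.
    """
    correct = scores[label]
    best_other = max(s for j, s in enumerate(scores) if j != label)
    return max(0.0, margin + best_other - correct)

confident = multiclass_hinge([2.0, 0.5, -1.0], label=0)   # margin satisfied
uncertain = multiclass_hinge([0.2, 0.5, -1.0], label=0)   # true class not on top
```

The loss is exactly zero once the margin is met, so well-classified samples stop contributing to the update. Whether this explains the faster convergence reported above is not stated in the abstract.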

eScholarship - University of California

A cluster based approach to reduce pattern layer size for generalized regression neural network

Author: Kartal Serkan
Oral Mustafa
Ozyildirim Buse Melis
Publication venue: 'LookUs Bilisim A.S.'
Publication date: 01/01/2018

WOS: 000446742400009
Generalized Regression Neural Network (GRNN) is a radial basis function based, supervised learning Artificial Neural Network (ANN) that is commonly used for data prediction. In addition to its easy modelling structure, its speed and accurate results are its other strong features. On the other hand, GRNN employs one neuron in the pattern layer for each data sample in the training data set. Therefore, for huge data sets, the pattern layer size increases in proportion to the number of samples in the training data set, and memory requirements and computational time also increase excessively. In this study, in order to reduce the space and time complexity of GRNN, the k-means clustering algorithm, which has been used as a pre-processor in the literature, is utilized, and the emergence of outlier data, which negatively affects the performance of previous studies, is prevented by identifying test data located between clusters. Hence, while the memory requirement in the pattern layer and the number of calculations are reduced, the negative effect on performance caused by the clustering algorithm is largely removed, and almost the same prediction performance as that of the standard GRNN is achieved using 90% fewer training samples.
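The idea of shrinking the pattern layer with k-means can be sketched as follows, assuming one-dimensional inputs, a Gaussian kernel, and cluster centroids (with cluster-mean targets) standing in for the original samples. The helper names, the kernel width sigma, and the toy data are hypothetical. The paper's handling of test data located between clusters is not reproduced here.

```python
import math

def grnn_predict(x, patterns, targets, sigma=0.5):
    """GRNN output: Gaussian-weighted average of the stored targets."""
    weights = [math.exp(-((x - p) ** 2) / (2 * sigma ** 2)) for p in patterns]
    return sum(w * t for w, t in zip(weights, targets)) / sum(weights)

def assign(xs, centroids):
    """Group sample indices by their nearest centroid."""
    groups = [[] for _ in centroids]
    for i, x in enumerate(xs):
        nearest = min(range(len(centroids)), key=lambda c: abs(x - centroids[c]))
        groups[nearest].append(i)
    return groups

def reduce_patterns(xs, ys, k, iters=20):
    """Plain 1-D k-means; returns centroids and the mean target per cluster."""
    ordered = sorted(xs)
    centroids = [ordered[(2 * c + 1) * len(xs) // (2 * k)] for c in range(k)]
    for _ in range(iters):
        groups = assign(xs, centroids)
        centroids = [sum(xs[i] for i in g) / len(g) if g else centroids[c]
                     for c, g in enumerate(groups)]
    groups = assign(xs, centroids)
    kept = [(c, sum(ys[i] for i in g) / len(g))
            for c, g in zip(centroids, groups) if g]
    return [c for c, _ in kept], [t for _, t in kept]

xs = [i / 10 for i in range(100)]      # 100 toy training inputs
ys = [2 * x for x in xs]               # toy targets
patterns, targets = reduce_patterns(xs, ys, k=10)   # at most 10 pattern neurons
full = grnn_predict(5.0, xs, ys)
reduced = grnn_predict(5.0, patterns, targets)
```

On this toy data the reduced network stores a tenth of the patterns, which mirrors the scale of reduction the abstract reports, though the numbers here say nothing about the paper's actual benchmarks.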

Çukurova University Institutional Repository

Pattern Layer Reduction for a Generalized Regression Neural Network by Using a Self–Organizing Map

Author: Kartal Serkan
Oral Mustafa
Ozyildirim Buse Melis
Publication venue: 'Walter de Gruyter GmbH'
Publication date: 01/01/2018

In a general regression neural network (GRNN), the number of neurons in the pattern layer is proportional to the number of training samples in the dataset. The use of a GRNN in applications that have relatively large datasets becomes troublesome due to the architecture and speed required. The large number of neurons in the pattern layer requires a substantial increase in memory usage and causes a substantial decrease in calculation speed. Therefore, there is a strong need for pattern layer size reduction. In this study, a self-organizing map (SOM) structure is introduced as a pre-processor for the GRNN. First, an SOM is generated for the training dataset. Second, each training record is labelled with the most similar map unit. Lastly, when a new test record is applied to the network, the most similar map units are detected, and the training data that have the same labels as the detected units are fed into the network instead of the entire training dataset. This scheme enables a considerable reduction in the pattern layer size. The proposed hybrid model was evaluated by using fifteen benchmark test functions and eight different UCI datasets. According to the simulation results, the proposed model significantly simplifies the GRNN's structure without any performance loss.
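The three steps described above (build a map, label each training record with its most similar unit, then feed only the records matching the units closest to the test record) can be sketched roughly as below. This is a minimal one-dimensional illustration under assumed settings: the SOM size, learning-rate and radius schedules, the choice of two best units, the fallback to the full dataset, and all function names are hypothetical and not taken from the paper.

```python
import math
import random

def best_unit(x, units):
    """Index of the map unit most similar (closest) to x."""
    return min(range(len(units)), key=lambda u: abs(x - units[u]))

def train_som(xs, n_units=10, epochs=50, lr0=0.5, seed=0):
    """Tiny 1-D SOM with linearly decaying learning rate and radius."""
    rng = random.Random(seed)
    lo, hi = min(xs), max(xs)
    units = [lo + (hi - lo) * (i + 0.5) / n_units for i in range(n_units)]
    order = list(xs)
    for epoch in range(epochs):
        frac = 1.0 - epoch / epochs
        lr = lr0 * frac
        radius = max(0.5, (n_units / 2) * frac)
        rng.shuffle(order)
        for x in order:
            bmu = best_unit(x, units)
            for u in range(n_units):
                h = math.exp(-((u - bmu) ** 2) / (2 * radius ** 2))
                units[u] += lr * h * (x - units[u])
    return units

def select_patterns(x, xs, ys, units, labels, n_best=2):
    """Keep only training records labelled with the n_best closest units."""
    ranked = sorted(range(len(units)), key=lambda u: abs(x - units[u]))
    chosen = set(ranked[:n_best])
    subset = [(p, t) for p, t, lab in zip(xs, ys, labels) if lab in chosen]
    return subset or list(zip(xs, ys))   # assumed fallback: use all records

def grnn_predict(x, subset, sigma=0.5):
    """GRNN output over the selected (pattern, target) pairs."""
    weights = [math.exp(-((x - p) ** 2) / (2 * sigma ** 2)) for p, _ in subset]
    return sum(w * t for w, (_, t) in zip(weights, subset)) / sum(weights)

xs = [i / 10 for i in range(100)]
ys = [2 * x for x in xs]
units = train_som(xs)                          # step 1: generate the map
labels = [best_unit(p, units) for p in xs]     # step 2: label training records
subset = select_patterns(5.0, xs, ys, units, labels)   # step 3: select by label
pred = grnn_predict(5.0, subset)
```

The pattern layer used for a given test record then has len(subset) neurons rather than len(xs). How much smaller that is depends on how the trained map partitions the data.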

Çukurova University Institutional Repository

Directory of Open Access Journals

Biosorption Modeling with Multilayer Perceptron for Removal of Lead and Zinc Ions Using Crab Shell Particles

Author: A. Esposito
A. Hernandez
A. Selatnia
A.S. Özcan
B. Balci
B. Volesky
C. Yan
C. Yang
Chen Bor-Yann
D. Knorr
D.S. Kim
E. Galli
E. Oguz
F. Luo
G. McKay
G. Uslu
H. Freundlich
H. Ucuna
I. Langmuir
I.B. Rae
K. Vijayaraghavan
K. Yetilmezsoy
L.J.A. Gerringa
L.O. Franco
M. Bittelli
M. Muhaemin
M.Y. Lee
N.C.M. Gomes
O. Gyliene
O. Redlich
Ozyildirim Buse Melis
P.R. Puranik
R. Senthilkumar
R.H.S.F. Vieira
S. Dahiya
S. Tunali
T.A. Davis
V.M. Janakiraman
Y.S. Ho
Z. Reddad
Publication venue: 'Springer Science and Business Media LLC'

Crossref